[POC, Do not merge] input_chunk: split incoming buffer when it's too big #9385

Open
wants to merge 8 commits into
base: master
braydonk
Contributor

This PR is a proof of concept for mitigating the issue where a chunk can be too large when received from an input plugin.

Bug Explanation

When an input plugin reads a large set of data at once, all of those records are appended to whichever chunk was most recently active, and they are written in a single operation. The chunk size check only happens after the data has been written to the chunk, so although the chunk size is "limited" to 2M, nothing guarantees that a chunk stays under that number. A chunk that is already close to the 2M limit can receive a large write and become excessively large, and once encoded it can exceed the write limits of output plugin APIs.
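To make the ordering problem concrete, here is a minimal sketch of an append-then-check flow. It is not the Fluent Bit code: `sketch_chunk`, `sketch_append_then_check`, and `SKETCH_CHUNK_MAX_SIZE` are hypothetical names, and the 2,000,000-byte value is only an illustrative stand-in for the 2M limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the 2M limit; not the real Fluent Bit constant. */
#define SKETCH_CHUNK_MAX_SIZE ((size_t) 2000000)

/* Hypothetical, simplified view of a chunk: only its size and a "full" flag. */
struct sketch_chunk {
    size_t size;   /* bytes written so far */
    bool   full;   /* set once the size check trips */
};

/* Append first, check afterwards: the ordering described above. */
void sketch_append_then_check(struct sketch_chunk *c, size_t len)
{
    assert(c != NULL);

    c->size += len;                          /* the whole buffer is written at once */
    if (c->size >= SKETCH_CHUNK_MAX_SIZE) {
        c->full = true;                      /* too late to prevent the overshoot */
    }
}
```

With a chunk that already holds 1,900,000 bytes, a single 5,000,000-byte append leaves it at 6,900,000 bytes before the check ever runs.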

Proposed Solution

This solution attempts to mitigate the problem without a major restructuring. The strategy is to examine the size of the incoming buffer: if it exceeds FLB_CHUNK_FS_MAX_SIZE (2M), the buffer is split into separate buffers that are each under the max size, and each one is appended to a chunk separately. This is paired with a check made when retrieving an input chunk: if appending the current buffer would exceed the chunk size limit, a new chunk is created instead.
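The two pieces of this strategy can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: `sketch_split`, `sketch_needs_new_chunk`, and `SKETCH_CHUNK_MAX_SIZE` are hypothetical names, the limit value is an illustrative stand-in, and the sketch assumes the buffer can only be cut on record boundaries, so it works on a list of record sizes rather than raw bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the 2M limit; not the real Fluent Bit constant. */
#define SKETCH_CHUNK_MAX_SIZE ((size_t) 2000000)

/* Hypothetical pre-append check: would this buffer push the chunk past the limit? */
bool sketch_needs_new_chunk(size_t chunk_size, size_t incoming_len)
{
    return chunk_size + incoming_len > SKETCH_CHUNK_MAX_SIZE;
}

/*
 * Group consecutive records into pieces of at most max_size bytes.
 * rec_len[i] is the encoded size of record i; piece_len must have room
 * for n_rec entries. Returns the number of pieces written to piece_len.
 * Assumption: a record cannot be cut, so a single record larger than
 * max_size ends up alone in its own oversized piece.
 */
size_t sketch_split(const size_t *rec_len, size_t n_rec,
                    size_t max_size, size_t *piece_len)
{
    size_t n_pieces = 0;
    size_t current = 0;
    size_t i;

    assert(rec_len != NULL && piece_len != NULL);

    for (i = 0; i < n_rec; i++) {
        /* Close the current piece if this record would overflow it. */
        if (current > 0 && current + rec_len[i] > max_size) {
            piece_len[n_pieces++] = current;
            current = 0;
        }
        current += rec_len[i];
    }
    if (current > 0) {
        piece_len[n_pieces++] = current;     /* flush the trailing piece */
    }
    return n_pieces;
}
```

In this sketch each piece would then be appended on its own, with `sketch_needs_new_chunk` consulted first so that a nearly full chunk is not reused for a piece that would overflow it.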

This solution is not perfect, but it was the best way I could find within my power (i.e. I don't consider major restructures to this code or chunkio to be "within my power").

Issues: #9374, #1938


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Add a configuration value for the storage chunk max size.

Signed-off-by: braydonk <braydonk@google.com>